Using STFT real and imaginary parts of modulation signals for MMSE-based speech enhancement

Authors

Belinda Schwerin

Kuldip K. Paliwal

Abstract

In this paper we investigate an alternate, RI-modulation (R = real, I = imaginary) AMS framework for speech enhancement, in which the real and imaginary parts of the modulation signal are processed in secondary AMS procedures. This framework offers theoretical advantages over the previously proposed modulation AMS frameworks in that noise is additive in the modulation signal and noisy acoustic phase is not used to reconstruct speech. Using MMSE magnitude estimation to modify modulation magnitude spectra, initial experiments presented in this work evaluate whether these advantages translate into improvements in processed speech quality. The effect of speech presence uncertainty and log-domain processing on MMSE magnitude estimation in the RI-modulation framework is also investigated. Finally, a comparison of different enhancement approaches applied in the RI-modulation framework is presented. Using subjective and objective experiments as well as spectrogram analysis, we show that RI-modulation MMSE magnitude estimation with speech presence uncertainty produces stimuli that listeners prefer over those of the other RI-modulation types. In comparisons with similar approaches in the modulation AMS framework, results showed that the theoretical advantages of the RI-modulation framework did not translate to an improvement in overall quality, with both frameworks yielding very similar sounding stimuli, but a clear improvement (compared to the corresponding modulation AMS based approach) in speech intelligibility was found. © 2013 Elsevier B.V. All rights reserved.
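
The two-stage structure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the frame lengths, hop sizes, Hann windows, and the helper names `stft`, `istft`, and `ri_modulation_enhance` are assumptions chosen for illustration, and `gain_fn` is a hypothetical placeholder for where a gain rule such as an MMSE magnitude estimator would go. For each acoustic frequency bin, the real and imaginary parts of the STFT trajectory are each passed through a second analysis-modification-synthesis (AMS) stage, and the enhanced real and imaginary parts are recombined, so the noisy acoustic phase is never applied explicitly.

```python
import numpy as np

def _hann(frame_len):
    # Periodic Hann window (assumed here; the abstract does not specify a window).
    return np.hanning(frame_len + 1)[:-1]

def stft(x, frame_len, hop):
    """Framed FFT of a 1-D signal. Assumes frame_len is a multiple of hop."""
    tail = (-len(x)) % hop
    xp = np.concatenate([np.zeros(frame_len, x.dtype), x,
                         np.zeros(frame_len + tail, x.dtype)])
    n_frames = 1 + (len(xp) - frame_len) // hop
    win = _hann(frame_len)
    frames = np.stack([xp[i * hop:i * hop + frame_len] * win
                       for i in range(n_frames)])
    return np.fft.fft(frames, axis=1)

def istft(X, frame_len, hop, length):
    """Weighted overlap-add inverse of stft(); returns a complex signal."""
    win = _hann(frame_len)
    frames = np.fft.ifft(X, axis=1)
    total = (X.shape[0] - 1) * hop + frame_len
    out = np.zeros(total, dtype=complex)
    norm = np.zeros(total)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + frame_len] += frame * win
        norm[i * hop:i * hop + frame_len] += win ** 2
    out /= np.maximum(norm, 1e-12)
    return out[frame_len:frame_len + length]  # drop the zero padding

def ri_modulation_enhance(x, gain_fn, a_len=256, a_hop=64, m_len=32, m_hop=4):
    """RI-modulation AMS sketch; all frame sizes are illustrative assumptions."""
    X = stft(x, a_len, a_hop)                  # acoustic spectrum, shape (T, K)
    n_t, n_k = X.shape
    Y = np.zeros_like(X)
    for k in range(n_k):
        parts = []
        for traj in (X[:, k].real, X[:, k].imag):
            M = stft(traj, m_len, m_hop)       # modulation spectrum of one part
            M_hat = gain_fn(np.abs(M)) * M     # scale magnitude, keep modulation phase
            parts.append(istft(M_hat, m_len, m_hop, n_t).real)
        Y[:, k] = parts[0] + 1j * parts[1]     # recombine enhanced R and I parts
    return istft(Y, a_len, a_hop, len(x)).real

# Usage: a unit gain leaves the signal unchanged (sanity check of the framework).
rng = np.random.default_rng(0)
signal = rng.standard_normal(2000)
restored = ri_modulation_enhance(signal, lambda mag: np.ones_like(mag))
```

With a unit gain the pipeline reduces to analysis followed by synthesis, which makes it easy to check the framing before substituting a real gain function for `gain_fn`.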

Similar resources

Speech enhancement using STFT of real and imaginary parts of modulation signals

This paper investigates an alternate modulation (RI-modulation) AMS-based framework for speech enhancement, in which real and imaginary parts of the modulation signal are processed in secondary AMS procedures. We propose to apply MMSE magnitude estimation in this framework, and using subjective experiments, show that MMSE RI-modulation magnitude estimation produces stimuli which are preferred by ...

Supergaussian Garch Models

In this paper, we introduce supergaussian generalized autoregressive conditional heteroscedasticity (GARCH) models for speech signals in the short-time Fourier transform (STFT) domain. We address the problem of speech enhancement, and show that estimating the variances of the STFT expansion coefficients based on GARCH models yields higher speech quality than by using the decision-directed metho...

Speech spectral modeling and enhancement based on autoregressive conditional heteroscedasticity models

In this paper, we develop and evaluate speech enhancement algorithms, which are based on supergaussian generalized autoregressive conditional heteroscedasticity (GARCH) models in the short-time Fourier transform (STFT) domain. We consider three different statistical models, two fidelity criteria, and two approaches for the estimation of the variances of the STFT coefficients. The statistical mo...

Enhancement and Recognition of Reverberant and Noisy Speech by Extending Its Coherence

Most speech enhancement algorithms make use of the short-time Fourier transform (STFT), which is a simple and flexible time-frequency decomposition that estimates the short-time spectrum of a signal. However, the duration of short STFT frames is inherently limited by the nonstationarity of speech signals. The main contribution of this paper is a demonstration of speech enhancement and automati...

Speech Enhancement using Laplacian Mixture Model under Signal Presence Uncertainty

This paper proposes an estimator for speech enhancement based on a Laplacian mixture model. The proposed method estimates the complex DFT coefficients of clean speech from noisy speech using the MMSE estimator, assuming that the clean-speech DFT coefficients follow a mixture of Laplacians and that the noise DFT coefficients follow a zero-mean Gaussian distribution. Furthermore, the MMS...

Journal:

Speech Communication

Volume 58

Publication date: 2014
